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Abstract 

In our earlier work on the development of a model-independent data analysis method 
for reconstructing the (moments of the) time-averaged one-dimensional velocity distribu- 
tion function of Weakly Interacting Massive Particles (WIMPs) by using measured recoil 
energies from direct Dark Matter detection experiments directly, it was assumed that the 
analyzed data sets are background-free, i.e., all events are WIMP signals. In this article, as 
a more realistic study, we take into account a fraction of possible residue background events, 
which pass all discrimination criteria and then mix with other real WIMP-induced events 
in our data sets. Our simulations show that, for the reconstruction of the one-dimensional 
WIMP velocity distribution, the maximal acceptable fraction of residue background events 
in the analyzed data set(s) of 0(500) total events is ~ 10% - 20%. For a WIMP mass of 
50 GeV with a negligible uncertainty and 20% residue background events, the deviation 
of the reconstructed velocity distribution would in principle be ~ 7.5% with a statistical 
uncertainty of ~ 18% (~ 19% for a background-free data set). 



1 Introduction 



Currently, direct Dark Matter detection experiments searching for Weakly Interacting Massive 
Particles (WIMPs) are one of the promising methods for understanding the nature of Dark 
Matter and identifying them among new particles produced at colliders as well as reconstructing 
the (sub) structure of our Galactic halo [1, 2, 3, 4]. In our earlier work [5], we developed methods 
for reconstructing the (moments of the) time-averaged one-dimensional velocity distribution of 
halo WIMPs by using (a functional form of) the recoil spectrum as well as the measured recoil 
energies directly. This analysis requires no prior knowledge about the WIMP density near the 
Earth nor about their scattering cross section on nucleus, the unique required information is the 
mass of incident WIMPs. We therefore turned to also develop the method for determining the 
WIMP mass model-independently by combining two experimental data sets with two different 
target nuclei [6, 7]. 

In the work on the development of these model-independent data analysis procedures by us- 
ing measured recoil energies from direct detection experiments directly, it was assumed that the 
analyzed data sets are background-free, i.e., all events are WIMP signals. Active background 
discrimination techniques should make this condition possible. For example, the ratio of the 
ionization to recoil energy, the so-called "ionization yield", used in the CDMS-II experiment 
provides an event-by-event rejection of electron recoil events to be better than 10^"^ misiden- 
tification [8]. By combining the "phonon pulse timing parameter", the rejection ability of the 
misidentified electron recoils (most of them are "surface events" with sufficiently reduced ioniza- 
tion energies) can be improved to be < 10~^ for electron recoils [8]. Moreover, as demonstrated 
by the CRESST collaboration [9], by means of inserting a scintillating foil, which causes some 
additional scintillation light for events induced by a-decay of ^^°Po and thus shifts the pulse 
shapes of these events faster than pulses induced by WIMP interactions in the crystal, the pulse 
shape discrimination (PSD) technique can then easily distinguish WIMP-induced nuclear recoils 
from those induced by backgrounds^. 

However, as the most important issue in all underground experiments, the signal identification 
ability and possible residue background events which pass all discrimination criteria and then mix 
with other real WIMP-induced events in our data sets should also be considered. Therefore, in 
this article, as a more realistic study, we follow our first work on the effects of residue background 
events on the determination of the WIMP mass [13] and want to study how well we could 
reconstruct the WIMP velocity distribution model-independently by using "impure" data sets 
and how "dirty" these data sets could be to be still useful. 

The remainder of this article is organized as follows. In Sec. 2 I review the model-independent 
method for reconstructing the time-averaged one-dimensional velocity distribution function of 
halo WIMPs by using data from direct detection experiments directly. In Sec. 3 the effects of 
residue background events in the analyzed data sets on the measured energy spectrum as well as 
on the reconstructed WIMP mass will briefly be discussed. In Sec. 4 I show numerical results of 
the reconstructed WIMP velocity distribution by using mixed data sets with different fractions 
of residue background events based on Monte Carlo simulations. I conclude in Sec. 5. Some 
technical details will be given in an appendix. 

^For more details about background discrimination techniques and status in currently running and projected 
direct detection experiments, see e.g., Refs. [10, 11, 12] 



2 Methods for reconstructing the one— dimensional WIMP 
velocity distribution function 



In this section I review briefly the methods for reconstructing the one-dimensional WIMP ve- 
locity distribution function from the recoil spectrum as well as from experimental data directly. 
Detailed derivations and discussions can be found in Refs. [5, 14]. 



2.1 From the recoil spectrum 

The basic expression for the differential event rate for elastic WIMP-nucleus scattering is given 
by [3]: 



fiiv) 



dv , 



(1) 



Here R is the direct detection event rate, i.e., the number of events per unit time and unit mass 
of detector material, Q is the energy deposited in the detector, F{Q) is the clastic nuclear form 
factor, fi{v) is the one-dimensional velocity distribution function of the WIMPs impinging on 
the detector, v is the absolute value of the WIMP velocity in the laboratory frame. The constant 
coefficient A is defined as 



A = 



(2) 



where po is the WIMP density near the Earth and ctq is the total cross section ignoring the form 
factor suppression. The reduced mass mr,N is defined by 



mr,N 



m^mN 



(3) 



where is the WIMP mass and mN that of the target nucleus. Finally, Vmin is the minimal 
incoming velocity of incident WIMPs that can deposit the energy Q in the detector: 



^min ^yQ 

with the transformation constant 



a = 



2m2 



(4) 



(5) 



and Vmax is the maximal WIMP velocity in the Earth's reference frame, which is related to the 
escape velocity from our Galaxy at the position of the Solar system, Vesc ^ 600 km/s. 

In our earlier work [5], it was found that, by using a time-averaged recoil spectrum dR/dQ 
and assuming that no directional information exists, the normalized one-dimensional velocity 
distribution function of incident WIMPs, fi{v), can be solved from Eq. (1) directly as 
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where the normalization constant M is given by 

dQ\ . 



' dR' 



(7) 



Here the integral goes over the entire physically allowed range of recoil energies: starting at 
Q = Oj and the upper limit of the integral has been written as oo. Note that, because fi{v) in 
Eq. (6) is the normalized velocity distribution, the normalization constant A/" here is independent 
of the constant coefficient A defined in Eq. (2). Moreover, as the most important consequence, 
the velocity distribution function of halo WIMPs reconstructed by Eq. (6) is independent of the 
local WIMP density po as well as of the WIMP-nucleus cross section o-q. However, as we will see 
later, not only the overall normalization constant Af given in Eq. (7), but also the shape of the 
velocity distribution, through the transformation Q = jo? in Eq. (6), depends on the WIMP 
mass involved in the coefficient a defined in Eq. (5). 



2.2 From experimental data directly 

In order to use the expressions (6) and (7) for reconstructing ), one needs a functional form 
for the recoil spectrum dR/dQ. In practice this requires usually a fit to experimental data. 
However, data fitting will re-introduce some model dependence and make the error analysis 
more complicated. Hence, expressions that allow to reconstruct /i {v) directly from experimental 
data (i.e., measured recoil energies) have also been developed [5]. We started by considering 
experimental data described by 

Qn-^-f <Qn,i<Qn+^, i ^ 1, 2, ■ ■ ■ , N^, U ^ 1, 2, ■ ■ ■ , B . (8) 

Here the entire experimental possible energy range between Qmin and Qmax has been divided 
into B bins with central points Q„ and widths &„. In each bin, Nn events will be recorded. 
As argued in Ref . [5] , the statistical uncertainty on the "slope of the recoil spectrum" , 




appearing in the expression (6), scales like the bin width to the power —1.5. In addition, the 

wider the bin width, the more the recorded events in this bin, and thus the smaller the statistical 
uncertainty on the estimator of [d/dQ {dR/dQ)]Q^Q^^. Hence, since the recoil spectrum dR/dQ 
is expected to be approximately exponential, in order to approximate the spectrum in a rather 
wider range, instead of the conventional standard linear approximation, the following exponential 
ansatz for the measured recoil spectrum {before normalized by the exposure S) in the nth bin 
has been introduced [5]: 

/dR\ ^f^\ =r„e*^»(^-'2^-"). (9) 

V / expt, n \dQJ expt, Q~Q„ 

Here is the standard estimator for {dR/dQ)expt at Q = Qn'- 
N 

On 

kn is the logarithmic slope of the recoil spectrum in the nth Q— bin, which can be computed 
numerically from the average value of the measured recoil energies in this bin: 



where 



1 



{Q - QnY\n = TT E (Qn,i - Qnf ■ (12) 



Then the shifted point Qs,n in the ansatz (9), at which the leading systematic error due to the 
ansatz is minimal [5] , can be estimated by 



Qs,n — Qn I 



(13) 



Note that Qs,n differs from the central point of the nth bin, Qn. 

Now, substituting the ansatz (9) into Eq. (6) and then letting Q — Qs,n, we can obtain that 
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Here 



Vs,n^ (^\lQs,n, (15) 

and the normalization constant M given in Eq. (7) can be estimated directly from the data: 

1 
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where the sum runs over all events in the sample. 



(16) 



2.3 Windowing the data set 

As mentioned above, the statistical uncertainty on the slope of the recoil spectrum around 
the central point Qn, [d/ dQ {dR/ dQ)]Q^Q^, is approximately proportional to h~^-^. Thus, in 
order to reduce the statistical uncertainty on the reconstructed velocity distribution function by 
Eq. (14), it seems to be better to use large bin width. However, neither the conventional linear 
approximation: 

i'-^] = ^ (17) 

nor the exponential ansatz given in Eq. (9) can describe the real (but as yet unknown) recoil 
spectrum exactly. The neglected terms of higher powers oi Q — Qn could therefore induce some 
uncontrolled systematic errors which increase with increasing bin width. Moreover, since the 
number of bins scales inversely with their size, by using large bins we would be able to estimate 
fi{v) only at a small number of velocities. Additionally, once a quite large bin width is used, 
it would correspondingly lead to a quite large value of the first reconstructible point of fi{v), 
i.e., fi,Tec{vs,i), since the central point Qi as well as the shifted point Qs,i of the first bin would 
be quite large. Finally, choosing a fixed bin size, as one conventionally does, would let errors 
on the estimated logarithmic slopes, and hence also on the estimates of fi{v), increase quickly 
with increasing Q ot v. This is due to the essentially exponential form of the expected recoil 
spectrum, which would lead to a quickly falling number of events in equal-sized bins. By some 
trial-and-error analyses it was found that the errors are roughly equal in all bins if the bin 
widths increase linearly [5]. 

Therefore, it has been introduced in Ref. [5] that one can first collect experimental data in 
relatively small bins and then combining varying numbers of bins into overlapping "windows" . 
In particular, the first window would be identical with the first bin. One starts by binning the 
data, as in Eq. (8), where the bin widths satisfy 



bn^bi + {n- 1)S , 



(18) 



i.e., 



Qn Qmin ~l~ 



Here the increment 5 satisfies 
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(19) 



5 = -^^^ (<5max - Qmin - Bh) , (20) 

B being the total number of bins, and Q(min,max) are the experimental minimal and maximal 
cut-off energies. Assume up to nw bins are collected into a window, with smaller windows at 
the borders of the range of Q. 

In order to distinguish the numbers of bins and windows, hereafter Latin indices n, m, ■ ■ ■ 
are used to label bins, and Greek indices fi, ■ ■ ■ to label windows. For 1 < /i < nw, the fith 
window simply consists of the first /x bins; for nw < IJ^ < B, the /ith window consists of bins 
— nw + 1, A* — + 2, ■ ■ ■ , n; and for B < ^ < B + nw — 1, the /xth window consists of 
the last nw — {/J' — B) bins. This can also be described by introducing the indices and 
which label the first and last bin contributing to the /ith window, with 

. for fi<nw, . 

= <^ (21a) 




for II > nw, 



and 




^, for//<S, 
""^"^ " for;,>B. 

The total number of windows defined through Eqs. (21a) and (21b) is evidently W = B + nw — ^, 
i.e., 1 < IJ> < B + nw — 1- 

As shown in the previous subsection, the basic observables needed for the reconstruction of 

fi{v) by Eq. (14) are the number of events in the nth Q— bin, A^„, as well as the average value 
of the measured recoil energies in this bin, Q — Qn\n- For a "windowed" data set, one can easily 
calculate the number of events per window as 

A^M= E (22) 
as well as the average value of the measured recoil energies 

^^^U = 1^ I E ^n^U I - ' (23) 

where is the central point of the ^th window. The exponential ansatz in Eq. (9) is now 
assumed to hold over an entire window. We can then estimate the prefactor as 

N 

r, = -^, (24) 

being the width of the /ith window. The logarithmic slope of the recoil spectrum in the fith 
window, kfj, as well as the shifted point Q,^ ^ (from the central point of each "window", Q^) can 
be calculated as in Eqs. (11) and (13) with "bin" quantities replaced by "window" quantities. 
Note that, due to the combination of bins into overlapping windows, these quantities are all 



correlated (for nw 7^ 1)- The expressions for estimating the covariance matrices are given in the 
appendix. 

Finally, the covariance matrix of the estimates of fi{v) at adjacent values of Vs,^ — a^Qs,^ 
is given by 
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(25) 



Note that Eq. (25) should in principle also include contributions involving the statistical error 

on the estimator for M in Eq. (16). However, this error and its correlations with the errors on 
the and A;^ have been found to be negligible compared to the errors included in Eq. (25) [5]. 



3 Effects of residue background events 



In this section I first show some numerical results of the energy spectrum of WIMP recoil signals 
mixed with a few background events. Then I review the effects of residue background events in 
the analyzed data sets on the reconstruction of the WIMP mass m^. 

For generating WIMP-induced signals, we use the shifted Maxwellian velocity distribution 
[2, 3, 5]: 



(26) 



Here vq ^ 220 km/s is the orbital velocity of the Sun in the Galactic frame, and is the Earth's 
velocity in the Galactic frame [15, 3, 4]^: 



1.05 + 0.07 COS 



'2TT{t 



1 yr 



(27) 



with tp ~ June 2nd is the date on which the velocity of the Earth relative to the WIMP halo 
is maximal. As a maximal cut-off of the velocity distribution function, the escape velocity has 
been set as Vcsc = 700 km/s. The Woods-Saxon elastic nuclear form factor [16, 3, 4] for the SI 
WIMP-nucleus cross section will also be used^. 

Meanwhile, we use the target-dependent exponential form introduced in Ref. [13] for the 
residue background spectrum: 




exp 



bg,ex 



g/keV \ 
^0.6 j 



(28) 



Here Q is the recoil energy, A is the atomic mass number of the target nucleus. The power index 
of A, 0.6, is an empirical constant, which has been chosen so that the exponential background 
spectrum is somehow similar to, but still different from the expected recoil spectrum of the target 
nuclei; otherwise, there is in practice no difference between the WIMP scattering and background 



^The time dependence of the Earth's velocity in the Galactic frame, the second term of Ve{t), will be ignored 
in our simulations, i.e., Vc = 1.05 uo will be used. 

■^Other commonly used analytic forms for the one-dimensional WIMP velocity distribution as well as for the 
elastic nuclear form factor for the SI WIMP-nucleus cross section can be found in Refs. [5, 17]. 



spectra. Note that, among different possible choices, we use in our simulations the atomic mass 
number A as the simplest, unique characteristic parameter in the general analytic form (28) for 
defining the residue background spectrum for different target nuclei. However, it does not mean 
that the (superposition of the real) background spectra would depend simply/primarily on A 
or on the mass of the target nucleus, m-M. In other words, it is practically equivalent to use 
expression (28) or {dR/dQ\^^^^ = ^-Qn^-^ kev directly for a ^^Ge target. 

Note also that, firstly, as argued in Ref. [13], the exponential form of background spectrum 
is rather naive; but, since we consider here only a few (tens) residue background events induced 
by perhaps two or more different sources, pass all discrimination criteria, and then mix with 
other WIMP-induced events in our data sets of a few hundreds total events, exact forms of 
different background spectra are actually not very important and this exponential form should 
practically not be unrealistic^. Secondly, as demonstrated in Ref. [5] and reviewed in the previous 
section, the model-independent data analysis procedure for reconstructing the WIMP velocity 
distribution function requires only measured recoil energies (induced mostly by WIMPs and 
occasionally by background sources) from direct detection experiments. Therefore, for applying 
this method to future real data, a prior knowledge about (different) background source(s) is not 
required at all. 

Moreover, for our numerical simulations presented here as well as in the next section, the 
actual numbers of signal and background events in each simulated experiment are Poisson- 
distributed around their expectation values independently; and the total event number recorded 
in one experiment is then the sum of these two numbers. Additionally, we assumed that all 
experimental systematic uncertainties as well as the uncertainty on the measurement of the 
recoil energy could be ignored. The energy resolution of most existing detectors is so good that 
its error can be neglected compared to the statistical uncertainty for the foreseeable future with 
pretty few events. 

3.1 On the measured energy spectrum 

In Figs. 1 I show measured energy spectra (solid red histograms) for a ^^Ge target with six 
different WIMP masses: 10, 25, 50, 100, 250, and 500 GeV based on Monte Carlo simulations. 
The dotted blue curves are the elastic WIMP-nucleus scattering spectra, whereas the dashed 
green curves are the exponential background spectra given in Eq. (28), which have been normal- 
ized so that the ratios of the areas under these background spectra to those under the (dotted 
blue) WIMP scattering spectra are equal to the background-signal ratio in the whole data sets 
(e.g., 20% backgrounds to 80% signals shown in Figs. 1). The experimental threshold energy 
has been assumed to be neghgible and the maximal cut-off energy is set as 100 keV. The back- 
ground windows (the possible energy ranges in which residue background events exist) have been 
assumed to be the same as the experimental possible energy ranges. 5,000 experiments with 500 
total events on average in each experiment have been simulated. Remind that the measured 
energy spectra shown here are averaged over the simulated experiments. Five bins with linear 
increased bin widths have been used for binning generated signal and background events. As 
argued in Sec. 2.3, for reconstructing the one-dimensional WIMP velocity distribution function, 
this unusual, particular binning has been chosen in order to accumulate more events in high 
energy ranges and thus to reduce the statistical uncertainties in high velocity ranges. 

It can be found in Figs. 1 that, the shape of the WIMP scattering spectrum depends highly on 
the WIMP mass: for light WIMPs (m^ < 50 GeV), the recoil spectra drop sharply with increasing 

''Other (more realistic) forms for background spectrum (perhaps also for some specified targets/experiments) 
can be tested on the AMIDAS website [18, 19] . 
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Figure 1: Measured energy spectra (solid red histograms) for a ^^Ge target with six different 
WIMP masses: 10, 25, 50, 100, 250, and 500 GeV. The dotted blue curves are the elastic WIMP- 
nucleus scattering spectra, whereas the dashed green curves are the exponential background 
spectra normalized to fit to the chosen background ratio, which has been set as 20% here. 
The experimental threshold energy has been assumed to be negligible and the maximal cut-off 
energy is set as 100 keV. The background windows have been assumed to be the same as the 
experimental possible energy ranges. 5,000 experiments with 500 total events on average in each 
experiment have been simulated. See the text for further details (plots from Ref. [13]). 
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Figure 2: The reconstructed WIMP mass and the lower and upper bounds of the la statistical 
uncertainty by using mixed data sets from WIMP-induced and background events as func- 
tions of the input WIMP mass. ^^Si and ''^Ge have been chosen as two target nuclei. The 
background ratios shown here are no background (dashed green curves), 5% (dotted magenta 
curves), 10% (long-dotted blue curves), 20% (solid red curves), and 40% (dash-dotted cyan 
curves) background events in the analyzed data sets in the experimental energy ranges between 
and 100 keV. Each experiment contains 50 (upper) and 500 (lower) total events on average. 
Other parameters are as in Figs. 1. See the text for further details. 



recoil energies, while for heavy WIMPs {m^ > 100 GeV), the spectra become flatter. In contrast, 
the exponential background spectra shown here depend only on the target mass and are rather 
flatter (sharper) for light (heavy) WIMP masses compared to the WIMP scattering spectra. This 
means that, once input WIMPs are light (heavy), background events would contribute relatively 
more to high (low) energy ranges, and, consequently, the measured energy spectra would mimic 
scattering spectra induced by heavier (hghter) WIMPs. 

More detailed illustrations and discussions about the effects of residue background events on 
the measured energy spectrum can be found in Ref. [13]. 

3.2 On the reconstructed WIMP mass 

Figs. 2 show the reconstructed WIMP mass and the lower and upper bounds of the la statistical 
uncertainty by means of the model-independent procedure introduced in Refs. [6, 7] with mixed 
data sets from WIMP-induced and background events as functions of the input WIMP mass. 
As in Ref. [7], ^^Si and ^^Ge have been chosen as two target nuclei. The experimental threshold 
energies of two experiments have been assumed to be negligible and the maximal cut-off energies 
are set the same as 100 keV. The background windows arc set as the same as the experimental 
possible energy ranges for both experiments. The background ratios shown here are no back- 
ground (dashed green curves), 5% (dotted magenta curves), 10% (long-dotted blue curves), 20% 
(solid red curves), and 40% (dash-dotted cyan curves) background events in the analyzed data 
sets. 2 X 5,000 experiments have been simulated. Each experiment contains 50 (upper) and 
500 (lower) total events on average. Note that all events recorded in our data sets are treated 
as WIMP signals in the analysis, although statistically we know that a fraction of these events 
could be backgrounds. 

It can be seen clearly that, for hght WIMP masses {m^ ^100 GeV), due to the relatively flat- 
ter background spectrum (compared to the scattering spectrum induced by hght WIMPs), the 
energy spectrum of all recorded events would mimic a scattering spectrum induced by WIMPs 
with a relatively heavier mass, and, consequently, the reconstructed WIMP masses as well as the 
statistical uncertainty intervals could be overestimated. In contrast, for heavy WIMP masses 
(m^ > 100 GeV), due to the relatively sharper background spectrum, relatively more back- 
ground events contribute to low energy ranges, and the energy spectrum of all recorded events 
would mimic a scattering spectrum induced by WIMPs with a relatively lighter mass. Hence, 
the reconstructed WIMP masses as well as the statistical uncertainty intervals could be under- 
estimated. Moreover, Figs. 2 show that the larger the fraction of background events in the data 
sets, the more strongly over-/underestimated the reconstructed WIMP masses as well as the 
statistical uncertainty intervals. Nevertheless, from Figs. 2 it can be found that, with ~ 10% 
residue background events in the analyzed data sets of ~ 500 total events, one could still esti- 
mate the WIMP mass pretty well; if WIMPs arc light {m^ < 200 GeV), the maximal acceptable 
fraction of residue background events could even be as large as ~ 20%. 

More detailed illustrations and discussions about the effects of residue background events on 
the determination of the WIMP mass can be found in Ref. [13] . 

4 Results of the reconstructed one— dimensional WIMP 
velocity distribution function 

In this section I present simulation results of the reconstructed one-dimensional velocity dis- 
tribution function of halo WIMPs with mixed data sets from WIMP-induced and background 



events by means of the model-independent method described in Sec. 2.^ The WIMP mass rriy^ 
involved in the coefficient a in Eqs. (15) and (16) for estimating the reconstructed points Vg^n 
(or Vs,fi for a windowed data set) as well as the normalization constant A/" has been assumed to 
be known precisely with a negligible uncertainty from other (e.g., collider) experiments or can 
be determined from ot/ier direct detection experiments. As in Ref. [5], a ^^Ge nucleus has been 
chosen as our detector target for reconstructing fi{v); while a ^^Si target and a second ^^Ge 
target have been used for determining m^. The experimental threshold energy of each experi- 
ment has been assumed to be negligible and the maximal cut-off energies are set the same as 
100 keV. The exponential background spectrum given in Eq. (28) has been used for generating 
background events in windows of the entire experimental possible ranges As in Figs. 1, five bins 
have been used^ and up to three bins have been combined to a window. (3 x) 5,000 experiments 
have been simulated. 

4.1 With a precisely known WIMP mass 

In this subsection we first assume that the required WIMP mass for determining the shape of 
the reconstructed velocity distribution through the transformation (15) from Qg ^ to v^^n (or 
from Qs^^ to Vg^n) and for estimating the normalization constant J\f by Eq. (16) has been known 
precisely with a negligible uncertainty. 

Figs. 3 show the one-dimensional WIMP velocity distribution function reconstructed by 
Eq. (14) with data sets of 500 total events on average for six different WIMP masses: 10, 25, 50, 
100, 250, and 500 GeV; all events in the data sets are treated as WIMP signals. The vertical error 
bars show the square roots of the diagonal entries of the covariance matrix^ given in Eq. (25), 
while the horizontal bars show the sizes of the windows used for estimating /i,rec('i^s,//)- The 
background ratios shown here are no background (dashed green fines), 10% (long-dotted blue 
lines), and 20% (solid red lines) background events in the analyzed data set in the background 
window of the entire experimental possible energy range. Note that, since the experimental 
maximal cut-off energy is fixed as 100 keV, for heavier input WIMP masses (m^ > 250 GeV), 
one can reconstruct the velocity distribution function only in the velocity range v ^ 300 km/s. 

It can be seen that, as shown in Figs. 1, for heavier WIMP masses (m^ > 100 GeV), 
the relatively sharper background spectrum contributes more events to low energy ranges, or, 
equivalently, to low velocity ranges. This shifts the reconstructed velocity distribution to lower 
velocities. For an input WIMP mass of 100 GeV and the background ratio of 10% (20%), the 
peak of the reconstructed velocity distribution function could be shifted by 30 (60) km/s. 

In contrast, for lighter WIMP masses {niy^ < 50 GeV), the relatively flatter background 
spectrum contributes more events to high energy /velocity ranges. This shifts the reconstructed 
velocity distribution to higher velocities. Note that, however, compared to the cases with the 
heavy WIMP masses, the reconstructed velocity distribution for light WlMPs seems not to be 
shifted to higher velocities very much. This should mainly be due to the exponential form of 
the background spectrum: its contribution to high energy ranges for light WIMPs is relatively 
not so significant as that to low ranges for heavy WIMPs (see Figs. 1). One exception should be 
the case with the input WIMP mass of 10 GeV. For this case, the WIMP scattering spectrum 
drops very sharp in the energy range between and 10 keV, while the exponential background 

''Note that, rather than the mean values, the (bounds on the) reconstructed /i,rec('i^s,ju) are always the median 
values of the simulated results. 

^For the input WIMP masses of 10 (25) GeV, the widths of the first bin have been modified to 1.5 (5) keV 
due to a kinematic maximal cut off energy discussed later. 

^Remind that, since the neighboring windows overlap, the estimates of /i by Eq. (14) at adjacent values of 
Vs^fj, are correlated. 
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Figure 3: The one-dimensional WIMP velocity distribution function reconstructed by Eq. (14) 
for six different WIMP masses: 10, 25, 50, 100, 250, and 500 GeV. The double-dotted black 
curves are the input shifted Maxwellian velocity distribution in Eq. (26). The vertical error bars 
show the square roots of the diagonal entries of the covariance matrix given in Eq. (25), while the 
horizontal bars show the sizes of the windows used for estimating fi,rec{vs,fi)- The background 
ratios shown here are no background (dashed green lines), 10% (long-dotted blue lines), and 
20% (solid red lines) background events in the analyzed data set in the background window of 
the entire experimental possible energy range. Parameters and notations are as in Figs. 2. See 
the text for further details. 
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Figure 4: As in Figs. 3, except that the constant background spectrum in Eq. (30) has been 
used. Note that the sohd red hues here indicate a background ratio of 5%. 



spectrum extends much wider to 100 keV (see Figs. 1). Thus a large part of background events 
contribute to energy ranges higher then 10 keV. However, because we reconstruct the WIMP 
velocity distribution only in the velocity range below the maximal cut-off velocity, i.e., the 
Galactic escape velocity Vesc, this leads to a kinematic maximal cut-off of the recoil energy 



Qmax,kin — o • (^^) 



a 



For a WIMP mass of 10 GeV and a ''^Ge target, it can be calculated that Qmax,kin = 11-8 keV. 
Therefore, all background events with energies larger than ~ 12 keV have actually been neglected 
in the data analysis. 

Not surprisingly, the larger the background ratio of our data set, the higher/lower the veloc- 
ities to which the reconstructed velocity distribution will be shifted. But, our simulation results 
shown in Figs. 3 indicate that, with an ~ 10% - 20% background ratio (i.e., ~ 50 - 100 events) 
in the analyzed data set of ~ 500 total events, one could in principle still reconstruct the one- 
dimensional velocity distribution function of halo WIMPs with an ~ 6.5% (for a 25 GeV WIMP 
mass, 20% background events) - ~ 38% (for a 250 GeV WIMP mass, 10% background events) 
deviation. If the mass of halo WIMPs is 0(50 GeV), the maximal acceptable background ratio 
could even be as large as ~ 40% (~ 200 events) with a deviation of only ~ 14%. 

In order to check the need of a prior knowledge about an (exact) form of the residue back- 
ground spectrum, in Figs. 4 we consider a rather extrem form for the residue background spec- 
trum, i.e., the constant spectrum introduced in Ref. [13]: 




(30) 

bg,const 

Here we show only results with a background ratio of 5% (solid red lines). It can be seen 
clearly that, although a constant background spectrum contributes (much) more events in high 
energy ranges for all six input WIMP masses^, taking into account the pretty large statistical 
uncertainty, we could at least give a rough outline of the WIMP velocity distribution for heavy 
WIMP masses {m^ > 100 GeV), or even reconstruct the distribution pretty well for light WIMPs 
("^x ~ GeV), thanks to the kinetic maximal cut-off energy (5max,kin discussed above. 



4.2 With a reconstructed WIMP mass 

In this subsection, the required WIMP mass for estimating the reconstructed points Vg^^i as well 
as the normalization constant M by Eqs. (15) and (16) has been reconstructed with other direct 
detection experiments^. Note that the statistical uncertainty on f\^rcc{vs,ii) estimated as the 
diagonal entries of the covariance matrix given in Eq. (25) must thus be modified by taking into 
account the statistical uncertainty on the reconstructed WIMP mass cr(m^) to 

2 

^'K) ■ (31) 

In Figs. 5 I show the numerical results with six different input WIMP masses, as shown in Figs. 3. 
Note that, while the vertical bars show the Icr statistical uncertainties estimated by Eq. (31), 

^Illustrations and detailed discussions about the effects of the constant form of the residue background spec- 
trum on the measured energy spectrum for different input WIMP masses can be found in Ref. [13]. 

^In order to avoid calculations of the correlations between and /i,rec(vs,p), we have assumed here that the 
two data sets using Ge detectors are independent of each other. 
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Figure 5: As in Figs. 3, except that the WIMP masses have been reconstructed by means of the 
procedure introduced in Refs. [6, 7]. Here the vertical bars show the la statistical uncertainties 
estimated by Eq. (31), while the horizontal bars show the la statistical uncertainties on the 
estimates of Vs,^ given in Eq. (15) due to the uncertainty on the reconstructed WIMP mass. See 
the text for further details. 
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Figure 6: As in Figs. 5, except tliat the constant background spectrum in Eq. (30) has been 
used. Note that the sohd red hues here indicate a background ratio of 5%. 



the horizontal bars shown here indicate the la statistical uncertainties on the estimates of Vg^^ 
given in Eq. (15) due to the uncertainty on the reconstructed WIMP mass; the statistical and 
systematic uncertainties due to estimating of Qs,fj. have been neglected here. 

It can be seen that, firstly, as shown in Figs. 2, for an input WIMP mass of 100 GeV, 
the reconstructed mass doesn't differ very much from the true value. Hence, the reconstructed 
/i,rec(t^s,/i) is approximately the same for both cases with the input/reconstructed WIMP mass. 
However, for light input masses (m^ < 100 GeV), the reconstructed /i,rec(fs,/i) with the re- 
constructed WIMP mass shift to relatively lower velocities compared to the case with the input 
(true) WIMP mass. This effect caused directly by the overestimate of the reconstructed WIMP 
mass. The coefficient a defined in Eq. (5) can be rewritten as 



This implies that, once the reconstructed WIMP mass is over- /underestimated from the real 
value, the coefficient a will thus be under-/overestimated. Consequently, v^^^ determined by 
Eq. (15) will be smaller/larger than the true values. 

Note that, firstly, this second effect, in contrast to the first one discussed in the previous sub- 
section, draws the reconstructed WIMP velocity distribution to the opposite directions; i.e., to 
lower/higher velocities if WIMPs are light/heavy. Secondly, for heavier WIMP masses, as shown 
in Figs. 2, with a small background fraction, the underestimate of the reconstructed WIMP mass 
and thus the shift of the velocity distribution function seem not to be significant. Nevertheless, 
the simulation results shown in Figs. 5 indicate that, with an ~ 5% - 10% background ratio (i.e., 
~ 25 - 50 events) in the analyzed data sets of ~ 500 total events, one could in principle still 
reconstruct the one-dimensional velocity distribution function of halo WIMPs with an ~ 7% 
(for 25 GeV WIMPs, 10% backgrounds) - ~ 16% (for 250 GeV WIMPs, 5% backgrounds) de- 
viation^°. If the mass of WIMPs is light (m^ < 100 GeV), the maximal acceptable background 
ratio could even be as large as ~ 20% (~ 100 events) with a deviation of only ~ 9%. 

In Figs. 6 we again use the constant spectrum for residue backgrounds and the WIMP mass 
has been reconstructed by using other mixed data sets. For this shown in Ref. [13], the 

WIMP mass would be overestimated for all input masses. And, consequently, the reconstructed 
WIMP velocity distribution for all six input masses are shifted to lower velocity ranges. Never- 
theless, as for the case with a known WIMP mass shown in Figs. 4, data sets with background 
fractions of < 5% could in principle be used to at least give a rough outline of the WIMP 
velocity distribution (for > 100 GeV), or even reconstruct the distribution pretty well (for 
< 100 GeV). 

Finally, we rise the expected event number in each experiment a factor of 10, to 5,000 events 
totally. The experimental maximal cut-off energies for WIMP signals and background windows 
have also been extended to 150 keV. Nine bins have been used^^ and up to four bins have been 
combined to a window. Since with 2 x 5,000 events we could in principle determine the WIMP 
mass with an uncertainty of ^ 5%, I show only the results of the reconstructed WIMP velocity 
distribution function with the input WIMP mass in Figs. 7. It can be seen clearly that, by 
using a data set of C(5,000) events with a maximal background ratio of < 5% (C(250) events), 
we could in principle reconstruct the WIMP velocity distribution function in the velocity range 

^•^Remind that, as shown in the lower frame of Figs. 2, with two data sets of ~ 500 total events each and a 
background ratio of ^ 10%, the WIMP mass could in principle be reconstructed with a statistical uncertainty of 
~ 10% (for 25 GeV WIMPs) - ~ 25% (for 250 GeV WIMPs). 

"For the input WIMP masses of 10/25/50 GeV, the widths of the first bin have been set as 1.5/2.5/2.5 keV. 
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Figure 7: As in Figs. 3, except that the expected number of total events in each experiment 
has been risen to 5,000 and the experimental maximal cut-off energies for WIMP signals and 
background windows have been extended to 150 keV. Note that the solid red lines here indicate 
a background ratio of 5%. 



V < e)(500 km/s) with a deviation of < 6% (for a WIMP mass of 100 GeV). Once WIMPs are 
light (m^ < 0(50 GeV)), this deviation could even be reduced to < 2.5%. 

5 Summary and conclusions 

In this paper we reexamine the model-independent data analysis method introduced in Ref. [5] 
for the reconstruction of the one-dimensional velocity distribution function of Weakly Inter- 
acting Massive Particles from data (measured recoil energies) of direct Dark Matter detection 
experiments directly by taking into account a fraction of residue background events, which pass 
all discrimination criteria and then mix with other real WlMP-induced events in the analyzed 
data sets. This method requires neither prior knowledge about the WIMP scattering spectrum 
nor about different possible background spectra; the unique needed information is the recoil 
energies recorded in direct detection experiments and (eventually) the mass of incident WIMPs. 

For the mass of incident WIMPs required as an unique input information in this method, 
we first assumed that it could be known precisely with a negligible uncertainty from other 
(e.g., collider) experiments. Our simulations show that, assuming an exponential form for the 
residue background spectrum, with a data set of ~ 500 total events, and a background ratio 
of ~ 10% - 20%, the WIMP velocity distribution function could in principle be reconstructed 
with an - 6.5% (for a 25 GeV WIMP mass, 20% background events) - ~ 38% (for a 250 GeV 
WIMP mass, 10% background events) deviation. If the WIMP mass is (9(50 GeV), the maximal 
acceptable background ratio could be risen to ~ 40% with a deviation of only ~ 14%. 

Moreover, for lighter/heavier WIMP masses, since the relatively flatter/sharper background 
spectrum could contribute relatively more events to high/low energy ranges, the reconstructed 
velocity distribution could therefore be shifted to higher/lower velocities. However, since for 
(very) light WIMPs (m^ < 40 GeV), the kinematic maximal cut-off of the recoil energy due to 
the Galactic escape velocity is (much) lower than the experimental cut-off, a (large) fraction of 
background events in high energy ranges could thus in practice be neglected, and the shift could 
not be very significant for WIMPs lighter than ~ 50 GeV. 

Since a model-independent method for determining the WIMP mass by using two experi- 
mental data sets with two different target nuclei has also been developed [6, 7], we considered 
in this paper also the case that the velocity distribution function is reconstructed with a re- 
constructed WIMP mass from other direct detection experiments. Our simulations show that, 
since hghter/heavier WIMP masses could be over-/underestimated by using this method with 
background-mixed data sets [13], the reconstructed points of the velocity distribution would 
thus be shifted to lower/higher velocities, the opposite direction of the shift due purely to the 
background contribution to high/low energy ranges. Although this second effect shifts the re- 
constructed velocity distribution (much) more strongly, with ~ 5% - 10% background events 
mixed in the analyzed data sets, the WIMP velocity distribution function could in principle 
still be reconstructed with an ~ 7% (for 25 GeV WIMPs, 10% backgrounds) - ~ 16% (for 250 
GeV WIMPs, 5% backgrounds) deviation. If the WIMP mass is < (9(100 GeV), the maximal 
acceptable background ratio could even be as large as ~ 20% with a deviation of only ~ 9%. 

Furthermore, in order to check the need of a prior knowledge about an (exact) form of 
the residue background spectrum, a constant spectrum for residue backgrounds has also been 
considered. Since the WIMP mass would always be overestimated [13], the reconstructed WIMP 
velocity distribution could thus be (strongly) shifted to lower velocity ranges. However, data sets 
with background fractions of < 5% could in principle be used to at least give a rough outline 
of the WIMP velocity distribution (for > 100 GeV), or even reconstruct the distribution 



pretty well (for < 100 GeV). 

Finally, for rather ncxt-to-next generation detectors, we considered also the case of 5,000 
total events and extended the maximal experimental cut-off energies for WIMP signals and 
backgrounds to 150 keV. Assuming a maximal background ratio of < 5%, our results show 
that the WIMP velocity distribution function could in principle be reconstructed in the velocity 
range v < (9(500 km/s) with a deviation of < 6% (for a WIMP mass of 100 GeV). Once 
WIMPs are light (m^ < (9(50 GeV)), this deviation could even be reduced to < 2.5%. 

In summary, as the second part of the study of the effects of residue background events 
in direct Dark Matter detection experiments, we considered the reconstruction of the velocity 
distribution function of halo WIMPs. Our results show that, with projected experiments using 
next-generation detectors with 10~^ to 10~^^ pb sensitivities [20, 21, 10, 22] and < 10""^ back- 
ground rejection ability [9, 11, 12, 8], once one or more experiments with different target nuclei 
could accumulate a few hundreds events (in one experiment), we could in principle at least give 
a rough outline of the WIMP velocity distribution function, e.g., an approximate estimate of 
the location of its peak, even though there could be some background events mixed in our data 
sets for the analysis. After that, by means of increased number of observed (WIMP-induced) 
events and improved background discrimination techniques [9, 11], the shape and properties of 
the velocity distribution of halo Dark Matter should be understood more clearly. 
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A Formulae needed in Sec. 2 

Here I list all formulae needed for the model-independent method described in Sec. 2. Detailed 
derivations and discussions can be found in Refs. [5, 14]. 

First, by using the standard Gaussian error propagation, the expression for the error on the 
logarithmic slope kn can be given from Eq. (11) directly as 



sinh(A;„6„/2) 



(7^ (Q - Qn\n) , 



(Al) 



with 



(7^ (Q - Qn\n) 



-Q-Q n 
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For replacing the "bin" quantities by "window" quantities, one needs the covariance matrix for 
Q ~ Q)j.\n, which follows directly from the definition (23): 

cov (C^ - Q^\^,Q - Q^\J^ 



N N 



E (Qln - QIm) {Q\n - Q\.) + A^>' (Q - Qn\n) 



(A3) 



Note that, firstly, /i < v has been assumed here and the covariance matrix is, of course, sym- 
metric. Secondly, the sum is understood to vanish if the two windows u do not overlap, i.e., 
if < rii/-. Moreover, from Eq. (24), we can get 



cov (r^,r,.) 



where /i < v has again been taken. And the mixed covariance matrix can be given by 



COV (r^, Q - (5^1^) = — J2 (Q\n - Q\l^ 



(A4) 



(A5) 



Note here that this sub-matrix is not symmetric under the exchange of // and v. In the definition 
of n_ and n+ we therefore have to distinguish two cases: 



U— — Til,— , Tl-\- — Tl/j,^, 



ri- 



n 



if // < 1/ ; 
if > 1/ . 



(A6) 



As before, the sum in Eq. (A5) is understood to vanish if n_ > n+. 

Furthermore, the covariance matrices involving the estimators of the logarithmic slopes /c^, 
estimated by Eq. (11) with replacing n ^ /i, can be given from Eq. (Al) as 



COV (/c^, k^) = k^k^ < 1 - 



smh{k^h^/2) 



-1 



svah.{kJ)^/2) 



X 



coy {Q - Q^\^,Q - QX) , 



and 



COV (r,,, k^) = kl\l- 
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